Graph Based Classification of Content and Users in BitTorrent
Abstract
P2P downloads still account for a large share of today’s Internet traffic. More than 100 million users run BitTorrent and generate more than 30% of total Internet traffic [7]. Recently, significant research effort has gone into tools for automatically classifying Internet traffic by application [9, 8, 11]. The purpose of the present work is to provide a framework for subclassifying the P2P traffic generated by the BitTorrent protocol. Unlike previous works [9, 8, 11], we cannot rely on packet-level characteristics and standard supervised machine learning methods. In [9, 8, 11], those methods depend on the availability of a large set of parameters (packet size, packet interarrival time, etc.). Since all P2P transfers use the same BitTorrent protocol, this set of parameters cannot be used to classify P2P content and users. Instead, we make use of the bipartite user-content graph, which is formed by two sets of nodes: the set of users (peers) and the set of contents (downloaded files). From this basic bipartite graph we also construct the user graph, in which two users are connected if they download the same content, and the content graph, in which two files are connected if both are downloaded by at least one common user. The general intuition is that users with similar interests download similar contents. This intuition can be rigorously formalized with the help of a graph-based semi-supervised learning approach [13].
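The graph construction described in the abstract can be illustrated with a minimal sketch. It assumes the download log is available as (user, content) pairs, which is a hypothetical input format. The function names `project` and `propagate`, the sample data, and the class labels are all illustrative. The propagation step is a simple neighbor-averaging scheme with clamped seed labels, chosen here only as one common form of graph-based semi-supervised learning; it is not necessarily the method of [13].

```python
from collections import defaultdict
from itertools import combinations


def project(downloads):
    """Build the user graph and content graph from (user, content) pairs.

    Both graphs are returned as adjacency sets (node -> set of neighbors).
    """
    users_of = defaultdict(set)     # content -> users who downloaded it
    contents_of = defaultdict(set)  # user -> contents they downloaded
    for user, content in downloads:
        users_of[content].add(user)
        contents_of[user].add(content)

    # Two users are connected if they downloaded the same content.
    user_graph = defaultdict(set)
    for peers in users_of.values():
        for a, b in combinations(sorted(peers), 2):
            user_graph[a].add(b)
            user_graph[b].add(a)

    # Two files are connected if at least one user downloaded both.
    content_graph = defaultdict(set)
    for files in contents_of.values():
        for a, b in combinations(sorted(files), 2):
            content_graph[a].add(b)
            content_graph[b].add(a)

    return user_graph, content_graph


def propagate(graph, seeds, classes, iterations=50):
    """Assumed scheme: average neighbor scores, keeping seed labels fixed."""
    nodes = set(graph) | set(seeds)
    scores = {n: {c: 0.0 for c in classes} for n in nodes}
    for node, label in seeds.items():
        scores[node][label] = 1.0

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            neighbors = graph.get(n, ())
            if n in seeds or not neighbors:
                updated[n] = scores[n]  # clamped seed or isolated node
            else:
                updated[n] = {
                    c: sum(scores[m][c] for m in neighbors) / len(neighbors)
                    for c in classes
                }
        scores = updated

    # Ties (including nodes that never receive a score) go to the first class.
    return {n: max(classes, key=lambda c: scores[n][c]) for n in nodes}


# Hypothetical download log and seed labels, for illustration only.
downloads = [
    ("u1", "movieA"), ("u2", "movieA"), ("u2", "movieB"),
    ("u3", "albumX"), ("u4", "albumX"), ("u4", "albumY"),
]
user_graph, content_graph = project(downloads)
labels = propagate(content_graph, {"movieA": "video", "albumX": "music"},
                   ["video", "music"])
```

Here only two files carry a known label, and the remaining files inherit a class through the content graph. The same `propagate` call could be applied to `user_graph` with labeled users as seeds.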
Similar resources
Detection of Fake Accounts in Social Networks Based on One Class Classification
Detection of fake accounts on social networks is a challenging process. The previous methods in identification of fake accounts have not considered the strength of the users’ communications, hence reducing their efficiency. In this work, we are going to present a detection method based on the users’ similarities considering the network communications of the users. In the first step, similarity ...
FakeDetector: A measurement-based tool to get rid out of fake content in your BitTorrent Downloads
Fake content represents an important portion of those files shared in BitTorrent. In this paper we conduct a large scale measurement study in order to analyse the fake content publishing phenomenon in the BitTorrent Ecosystem. Our results reveal that a few tens of users are responsible for 90% of the fake content. Furthermore, more than 99% of the analysed fake files are linked to either malwar...
Identification and Classification of Desirable Web-Based Services from the Perspective of Website Users of Iran’s Hospitals Based on Kano Model of Customer Satisfaction
Background and Aim: A hospital website is an appropriate system for exchanging information and connecting patients, hospitals and medical staff. The purpose of this study was to identify and classify desirable web-based services in websites of Iran's hospitals based on Kano’s Customer Satisfaction Model. Materials and Methods: This was a survey study. The statistical population of the study co...
TorrentGuard: Stopping scam and malware distribution in the BitTorrent ecosystem
In this paper we conduct a large scale measurement study in order to analyse the fake content publishing phenomenon in the BitTorrent Ecosystem. Our results reveal that fake content represents an important portion (35%) of those files shared in BitTorrent and just a few tens of users are responsible for 90% of this content. Furthermore, more than 99% of the analysed fake files are linked to eit...
Bittella: A Novel Content Distribution Overlay Based on Bittorrent and Social Groups
This paper presents Bittella: a new social network for content distribution based on Peer-to-Peer technologies. It exploits the common interests of the users in order to create social groups based on an algorithm called Ranking Algorithm. On the other hand, Bittella is deployed over a semantic-search based and unstructured p2p network, in spite of this it uses Bittorrent-like download technique...